
    Magnetic miniband and magnetotransport property of a graphene superlattice

    The eigenenergy and the conductivity of a graphene sheet subject to a one-dimensional cosinusoidal potential in the presence of a magnetic field are calculated. Such a graphene superlattice exhibits three distinct magnetic miniband structures as the magnetic field increases: a triply degenerate Landau level spectrum, nondegenerate minibands with finite dispersion, and the same Landau level spectrum as pristine graphene. The ratio of the magnetic length to the period of the potential is the characteristic quantity that determines the electronic structure of the superlattice. Corresponding to these distinct electronic structures, the diagonal conductivity is strongly anisotropic in the weak and moderate magnetic field cases, but the predominant magnetotransport orientation changes from the transverse to the longitudinal direction of the superlattice. More interestingly, in the weak magnetic field case the superlattice exhibits a half-integer quantum Hall effect, but with large jumps between the Hall plateaux, which distinguishes it from that of pristine graphene. Comment: 7 pages, 5 figures.
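    For reference, the magnetic length mentioned in the abstract is the standard quantity $\ell_B = \sqrt{\hbar/(eB)}$, so the characteristic ratio can be written as follows, with $L$ denoting the period of the cosinusoidal potential (the symbol $\lambda$ is our notation, not the paper's):

        $$ \ell_B = \sqrt{\frac{\hbar}{eB}}, \qquad \lambda = \frac{\ell_B}{L}. $$

    As $B$ increases, $\ell_B$ (and hence $\lambda$) shrinks, which is the knob that drives the superlattice through the three miniband regimes described above.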

    Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations

    Low rank matrix approximation is an important tool in machine learning. Given a data matrix, low rank approximation helps to find factors and patterns, and provides concise representations for the data. Research on low rank approximation usually focuses on real matrices. However, in many applications the data are binary (categorical) rather than continuous, which leads to the problem of low rank approximation of binary matrices. Here we are given a $d \times n$ binary matrix $A$ and a small integer $k$. The goal is to find two binary matrices $U$ and $V$ of sizes $d \times k$ and $k \times n$ respectively, so that the Frobenius norm of $A - UV$ is minimized. There are two models of this problem, depending on the definition of the dot product of binary vectors: the $\mathrm{GF}(2)$ model and the Boolean semiring model. Unlike low rank approximation of real matrices, which can be efficiently solved by the Singular Value Decomposition, approximation of binary matrices is NP-hard even for $k=1$. In this paper, we consider the problem of Column Subset Selection (CSS), in which one low rank matrix must be formed by $k$ columns of the data matrix. We characterize the approximation ratio of CSS for binary matrices. For the $\mathrm{GF}(2)$ model, we show the approximation ratio of CSS is bounded by $\frac{k}{2}+1+\frac{k}{2(2^k-1)}$ and that this bound is asymptotically tight. For the Boolean model, it turns out that CSS is no longer sufficient to obtain a bound. We then develop a Generalized CSS (GCSS) procedure in which the columns of one low rank matrix are generated from Boolean formulas operating bitwise on columns of the data matrix. We show the approximation ratio of GCSS is bounded by $2^{k-1}+1$, and that the exponential dependency on $k$ is inherent. Comment: 38 pages.
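    To make the $\mathrm{GF}(2)$ objective concrete, here is a minimal brute-force sketch of CSS for a binary matrix: it enumerates every set of $k$ columns and, for each column of $A$, its nearest $\mathrm{GF}(2)$ combination of the chosen columns. This is exponential in $k$ (the paper's contribution is the approximation-ratio analysis, not this enumeration), and the function name is ours:

        import itertools
        import numpy as np

        def css_gf2(A, k):
            """Brute-force CSS over GF(2): pick k columns of A, then fit each
            column of A by its nearest GF(2) combination of those columns."""
            d, n = A.shape
            combos = np.array(list(itertools.product([0, 1], repeat=k)),
                              dtype=np.uint8)        # all 2^k coefficient vectors
            best_err, best_cols = None, None
            for cols in itertools.combinations(range(n), k):
                U = A[:, cols]                       # d x k candidate basis
                cand = (U @ combos.T) % 2            # d x 2^k: every GF(2) combination
                # Hamming distance from each combination to each column of A
                dist = (cand[:, :, None] ^ A[:, None, :]).sum(axis=0)   # 2^k x n
                err = dist.min(axis=0).sum()         # = ||A - UV||_F^2 at the optimal V
                if best_err is None or err < best_err:
                    best_err, best_cols = err, cols
            return best_cols, best_err

    On a small random instance, e.g. A = rng.integers(0, 2, size=(8, 6), dtype=np.uint8) with k = 2, the returned error equals the squared Frobenius distance to the best CSS approximation.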

    Algorithmic Regularization in Model-free Overparametrized Asymmetric Matrix Factorization

    We study the asymmetric matrix factorization problem under a natural nonconvex formulation with arbitrary overparametrization. We consider the model-free setting, with minimal assumptions on the rank or singular values of the observed matrix, in which the global optima provably overfit. We show that vanilla gradient descent with small random initialization sequentially recovers the principal components of the observed matrix. Consequently, when equipped with proper early stopping, gradient descent produces the best low-rank approximation of the observed matrix without explicit regularization. We provide a sharp characterization of the relationship between the approximation error, iteration complexity, initialization size, and stepsize. Our complexity bound is almost dimension-free and depends logarithmically on the approximation error, with significantly more lenient requirements on the stepsize and initialization compared to prior work. Our theoretical results accurately predict the behavior of gradient descent, showing good agreement with numerical experiments. Comment: 30 pages, 7 figures.
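    A minimal NumPy sketch of the procedure the abstract studies: vanilla gradient descent on $\|M - UV^\top\|_F^2$ with overparametrized factors ($k$ at least the target rank) and small random initialization. Hyperparameters and names here are illustrative, not the paper's:

        import numpy as np

        def gd_factorize(M, k, eta=0.01, init_scale=1e-6, iters=5000, seed=0):
            """Vanilla GD on f(U, V) = ||M - U V^T||_F^2 from a tiny random init."""
            rng = np.random.default_rng(seed)
            U = init_scale * rng.standard_normal((M.shape[0], k))
            V = init_scale * rng.standard_normal((M.shape[1], k))
            for _ in range(iters):
                R = U @ V.T - M                                  # residual
                U, V = U - eta * (R @ V), V - eta * (R.T @ U)    # simultaneous update
            return U, V

    Tracking $\|M - UV^\top\|_F$ across iterations and stopping early at the desired accuracy is what yields the best low-rank approximation behavior described above.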

    Rethinking Lipschitz Neural Networks and Certified Robustness: A Boolean Function Perspective

    Designing neural networks with a bounded Lipschitz constant is a promising way to obtain certifiably robust classifiers against adversarial examples. However, progress for the important $\ell_\infty$ perturbation setting has been rather limited, and a principled understanding of how to design expressive $\ell_\infty$ Lipschitz networks is still lacking. In this paper, we bridge the gap by studying certified $\ell_\infty$ robustness from a novel perspective of representing Boolean functions. We derive two fundamental impossibility results that hold for any standard Lipschitz network: one for robust classification on finite datasets, and the other for Lipschitz function approximation. These results show that networks built upon norm-bounded affine layers and Lipschitz activations intrinsically lose expressive power even in the two-dimensional case, and shed light on how recently proposed Lipschitz networks (e.g., GroupSort and $\ell_\infty$-distance nets) bypass these impossibilities by leveraging order statistic functions. Finally, based on these insights, we develop a unified Lipschitz network that generalizes prior works, and design a practical version that can be efficiently trained (making certified robust training free). Extensive experiments show that our approach is scalable, efficient, and consistently yields better certified robustness across multiple datasets and perturbation radii than prior Lipschitz networks. Our code is available at https://github.com/zbh2047/SortNet. Comment: 37 pages; to appear in NeurIPS 2022 (Oral).
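    For concreteness, the two order-statistic constructions named above can be sketched as follows; these follow the published definitions (GroupSort sorts pre-activations within groups, and an $\ell_\infty$-distance unit computes $\|x - w\|_\infty + b$), though the snippet itself is only an illustration:

        import numpy as np

        def groupsort(x, group_size=2):
            """GroupSort: sort entries within contiguous groups (group_size=2
            is MaxMin). A permutation of its inputs, hence 1-Lipschitz, yet
            able to express order statistics that elementwise activations
            cannot. Assumes the last axis is divisible by group_size."""
            n = x.shape[-1]
            g = x.reshape(*x.shape[:-1], n // group_size, group_size)
            return np.sort(g, axis=-1).reshape(x.shape)

        def linf_distance_neuron(x, w, b=0.0):
            """l_inf-distance unit u(x) = ||x - w||_inf + b, which is
            1-Lipschitz w.r.t. the l_inf norm by the triangle inequality."""
            return np.max(np.abs(x - w)) + b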

    Content-based Controls For Music Large Language Modeling

    Recent years have witnessed a rapid growth of large-scale language models in the domain of music audio. Such models enable end-to-end generation of higher-quality music, and some allow conditioned generation using text descriptions. However, the control power of text over music is intrinsically limited, as text can only describe music indirectly through metadata (such as singers and instruments) or high-level representations (such as genre and emotion). We aim to further equip the models with direct, content-based controls on innate music languages such as pitch, chords, and the drum track. To this end, we contribute Coco-Mulla, a content-based control method for music large language modeling. It uses a parameter-efficient fine-tuning (PEFT) method tailored for Transformer-based audio models. Experiments show that our approach achieves high-quality music generation with low-resource semi-supervised learning, tuning less than 4% of the parameters of the original model and training on a small dataset with fewer than 300 songs. Moreover, our approach enables effective content-based controls, and we illustrate the control power via chords and rhythms, two of the most salient features of music audio. Furthermore, we show that by combining content-based controls and text descriptions, our system achieves flexible music variation generation and style transfer. Our source code and demos are available online.
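    The abstract does not spell out Coco-Mulla's conditioning design, but the generic PEFT pattern it relies on can be sketched as follows (a schematic only; the adapter architecture and the below-4% figure are the paper's, everything else here is illustrative):

        import torch.nn as nn

        def freeze_base_and_count(model: nn.Module, adapter: nn.Module) -> float:
            """Generic PEFT pattern: freeze the pretrained model so only the
            small adapter is trained; return the trainable parameter fraction."""
            for p in model.parameters():
                p.requires_grad = False
            trainable = sum(p.numel() for p in adapter.parameters())
            total = trainable + sum(p.numel() for p in model.parameters())
            return trainable / total         # the abstract reports < 4%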

    Nonlinear dynamics of full-range CNNs with time-varying delays and variable coefficients

    In this article, the dynamical behaviours of full-range cellular neural networks (FRCNNs) with variable coefficients and time-varying delays are considered. Firstly, an improved model of the FRCNNs is proposed, and the existence and uniqueness of the solution are studied by means of differential inclusions and set-valued analysis. Secondly, by using the Hardy inequality, matrix analysis, and the Lyapunov functional method, we obtain criteria for achieving globally exponential stability (GES). Finally, some examples are provided to verify the correctness of the theoretical results.
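    As a reference point, full-range models of this type are commonly written as a differential inclusion in which a normal-cone term hard-limits each state to $[-1,1]$; a representative form (the paper's precise model may differ in details) is:

        $$ \dot{x}_i(t) \in -d_i(t)\,x_i(t) + \sum_{j=1}^{n} a_{ij}(t)\, f_j(x_j(t)) + \sum_{j=1}^{n} b_{ij}(t)\, f_j\big(x_j(t - \tau_{ij}(t))\big) + u_i(t) - N_{[-1,1]}\big(x_i(t)\big), $$

    where $N_{[-1,1]}(x)$ is the normal cone to $[-1,1]$ at $x$; this set-valued right-hand side is why existence and uniqueness are handled with differential inclusions, as stated above.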

    A Validation Approach to Over-parameterized Matrix and Image Recovery

    In this paper, we study the problem of recovering a low-rank matrix from a number of noisy random linear measurements. We consider the setting where the rank of the ground-truth matrix is unknown a priori and use an overspecified factored representation of the matrix variable, in which the global optimal solutions overfit and do not correspond to the underlying ground truth. We then solve the associated nonconvex problem using gradient descent with small random initialization. We show that as long as the measurement operators satisfy the restricted isometry property (RIP) with a rank parameter scaling with the rank of the ground-truth matrix rather than with the overspecified matrix variable, gradient descent iterations follow a particular trajectory towards the ground-truth matrix and achieve nearly information-theoretically optimal recovery when stopped appropriately. We then propose an efficient early stopping strategy based on the common hold-out method and show that it provably detects a nearly optimal estimator. Moreover, experiments show that the proposed validation approach can also be used efficiently for image restoration with the deep image prior, which over-parameterizes an image with a deep network. Comment: 29 pages and 9 figures.
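    A minimal sketch of the hold-out strategy described above, assuming square $d \times d$ measurement matrices and observations $y_i = \langle A_i, X^\star \rangle + \text{noise}$: reserve a fraction of the measurements for validation and keep the gradient-descent iterate with the smallest validation loss. All names and hyperparameters are illustrative:

        import numpy as np

        def gd_with_holdout(As, ys, d, k, eta=0.002, iters=3000, val_frac=0.1, seed=0):
            """Over-parameterized factored GD with hold-out early stopping:
            train on one split of the measurements, keep the iterate whose
            validation loss is smallest."""
            rng = np.random.default_rng(seed)
            idx = rng.permutation(len(ys))
            n_val = int(val_frac * len(ys))
            val, tr = idx[:n_val], idx[n_val:]
            U = 1e-6 * rng.standard_normal((d, k))
            V = 1e-6 * rng.standard_normal((d, k))
            best_loss, best_X = np.inf, None
            for _ in range(iters):
                X = U @ V.T
                v = np.mean([(np.vdot(As[i], X) - ys[i]) ** 2 for i in val])
                if v < best_loss:                    # validation-based early stopping
                    best_loss, best_X = v, X.copy()
                G = sum((np.vdot(As[i], X) - ys[i]) * As[i] for i in tr) / len(tr)
                U, V = U - eta * (G @ V), V - eta * (G.T @ U)
            return best_X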

    A model-data asymptotic-preserving neural network method based on micro-macro decomposition for gray radiative transfer equations

    We propose a model-data asymptotic-preserving neural network (MD-APNN) method to solve the nonlinear gray radiative transfer equations (GRTEs). The system is challenging to simulate with both traditional numerical schemes and vanilla physics-informed neural networks (PINNs) due to its multiscale characteristics. Within the framework of PINNs, we employ a micro-macro decomposition technique to construct a new asymptotic-preserving (AP) loss function, which includes the residual of the governing equations in micro-macro coupled form, the initial and boundary conditions with additional diffusion-limit information, the conservation laws, and a few labeled data. A convergence analysis is performed for the proposed method, and a number of numerical examples are presented to illustrate the efficiency of MD-APNNs and, in particular, the importance of the AP property in the neural networks for diffusion-dominated problems. The numerical results indicate that MD-APNNs outperform APNNs and pure data-driven networks in the simulation of the nonlinear non-stationary GRTEs.
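    Schematically, the micro-macro decomposition splits the radiative intensity into its angular average and a mean-zero fluctuation, and the AP loss aggregates the four groups of terms listed above (the weights $\lambda_i$ are our placeholder notation; the abstract does not give the weighting):

        $$ I = \rho + \varepsilon\, g, \qquad \rho = \langle I \rangle, \quad \langle g \rangle = 0, $$

        $$ \mathcal{L}(\theta) = \lambda_1 \mathcal{L}_{\mathrm{res}} + \lambda_2 \mathcal{L}_{\mathrm{ibc}} + \lambda_3 \mathcal{L}_{\mathrm{cons}} + \lambda_4 \mathcal{L}_{\mathrm{data}}, $$

    where $\varepsilon$ is the small multiscale parameter whose $\varepsilon \to 0$ limit gives the diffusion regime, $\mathcal{L}_{\mathrm{res}}$ is the residual of the micro-macro coupled system, $\mathcal{L}_{\mathrm{ibc}}$ collects the initial/boundary terms with diffusion-limit information, $\mathcal{L}_{\mathrm{cons}}$ the conservation-law terms, and $\mathcal{L}_{\mathrm{data}}$ the misfit on the labeled data.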